Karma Tarap & Alexander Noll
26-10-2017
Standard convolutional neural networks (CNNs) are good at global classification:
But what if we want to ask a different question: where is the hotdog?
So we have a local classification problem:
Filter
Convolutional layers exploit the spatial symmetry of the problem to reduce the number of weights. They do this through weight sharing: a hotdog in the upper right corner looks the same as a hotdog in the lower left corner, so the same filter weights can be reused at every position.
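To make weight sharing concrete, here is a minimal sketch in plain Python (no deep learning library). The helper name conv2d_valid and the toy 6x6 image are illustrative assumptions, not taken from the slides. Like most deep learning frameworks, it computes a cross-correlation: one kernel is slid over the image, so the same pattern gives the same response wherever it appears.

```python
def conv2d_valid(image, kernel):
    """Slide one shared kernel over a 2-D image (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]


# Place the same 2x2 pattern in the upper right and lower left corners.
pattern = [[1, 2], [3, 4]]
image = [[0] * 6 for _ in range(6)]
for a in range(2):
    for b in range(2):
        image[a][4 + b] = pattern[a][b]   # upper right corner
        image[4 + a][b] = pattern[a][b]   # lower left corner

# Use the pattern itself as the (shared) filter.
response = conv2d_valid(image, pattern)
upper_right = response[0][4]
lower_left = response[4][0]

# Weight count for an assumed layer: 3x3 kernel, 3 input channels, 64 filters.
conv_weights = 3 * 3 * 3 * 64 + 64                 # kernels plus biases
# A fully connected layer mapping a 224x224x3 input to a 224x224x64 output.
dense_weights = (224 * 224 * 3) * (224 * 224 * 64)
```

Both corners produce the same response because the filter weights do not depend on position, and the shared-kernel layer needs only 1792 weights, compared with hundreds of billions for the fully connected alternative.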
Important parameters:
Pooling layers summarize each patch of an image by taking the maximum or the mean (or some other aggregate).
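The patch summary can be sketched in a few lines of plain Python. The function name pool2d and its size, stride and reducer arguments are hypothetical choices for illustration; a 2x2 window with stride 2 is assumed as the default.

```python
def pool2d(image, size=2, stride=2, reducer=max):
    """Summarize each size x size patch of a 2-D image with `reducer`."""
    pooled = []
    for i in range(0, len(image) - size + 1, stride):
        row = []
        for j in range(0, len(image[0]) - size + 1, stride):
            patch = [image[i + a][j + b]
                     for a in range(size) for b in range(size)]
            row.append(reducer(patch))
        pooled.append(row)
    return pooled


def mean(values):
    return sum(values) / len(values)


image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]

max_pooled = pool2d(image)                  # [[6, 8], [14, 16]]
mean_pooled = pool2d(image, reducer=mean)   # [[3.5, 5.5], [11.5, 13.5]]
```

With these settings each pooling step halves the height and width, which is why the output of a deep network is much coarser than its input.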
Important parameters:
Architecture of VGG16
Architecture of FCN32
How do we retrieve the size of the original image?
Use upsampling or transposed convolutional (also called deconvolutional) layers to upscale the image
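As a rough illustration of the two options, the sketch below implements nearest-neighbour upsampling and a transposed convolution in plain Python. The function names and the 2x2 input are assumptions made for this example. In a real network the transposed-convolution kernel would be learned; here it is fixed to all ones so that the result is easy to check by hand.

```python
def upsample_nearest(image, factor):
    """Repeat every pixel `factor` times along both axes."""
    return [
        [value for value in row for _ in range(factor)]
        for row in image
        for _ in range(factor)
    ]


def conv_transpose2d(image, kernel, stride):
    """Each input pixel stamps a scaled copy of the kernel into the output."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = (len(image) - 1) * stride + kh
    out_w = (len(image[0]) - 1) * stride + kw
    out = [[0] * out_w for _ in range(out_h)]
    for i, row in enumerate(image):
        for j, value in enumerate(row):
            for a in range(kh):
                for b in range(kw):
                    out[i * stride + a][j * stride + b] += value * kernel[a][b]
    return out


coarse = [[1, 2], [3, 4]]

nearest = upsample_nearest(coarse, 2)

# A 2x2 kernel of ones with stride 2 reproduces nearest-neighbour upsampling.
ones = [[1, 1], [1, 1]]
transposed = conv_transpose2d(coarse, ones, stride=2)
```

With a kernel of ones and a stride equal to the kernel size the two methods coincide. A learned kernel lets the network choose its own interpolation instead.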
Transfer learning with direct upsampling produced state-of-the-art performance
Skip connections transfer spatial information from early layers directly into layers close to the output layer via elementwise addition.
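The elementwise addition can be sketched as follows. This is a simplified illustration with assumed toy values: deep_scores stands for a coarse score map from a late layer and early_scores for a finer score map derived from an earlier layer. Both names are hypothetical, and nearest-neighbour upsampling stands in for whatever upsampling layer the network actually uses.

```python
def upsample_nearest(image, factor):
    """Repeat every pixel `factor` times along both axes."""
    return [
        [value for value in row for _ in range(factor)]
        for row in image
        for _ in range(factor)
    ]


def add_elementwise(a, b):
    """Add two score maps of identical shape, pixel by pixel."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("score maps must have the same shape")
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


# Coarse scores from a deep layer (2x2) and finer scores from an early layer (4x4).
deep_scores = [[1, 2], [3, 4]]
early_scores = [
    [0, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, 1, 0],
]

# Bring the deep scores to the early layer's resolution, then fuse by addition.
fused = add_elementwise(upsample_nearest(deep_scores, 2), early_scores)
```

The fused map keeps the coarse, semantically strong signal from the deep layer while the early layer contributes fine spatial detail that pooling had discarded.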
Fully Convolutional Networks for Semantic Segmentation
Jonathan Long, Evan Shelhamer, Trevor Darrell https://arxiv.org/abs/1411.4038
Demo